Automatic speech recognition with sparse training data for dysarthric speakers

Authors

  • Phil D. Green
  • James Carmichael
  • Athanassios Hatzis
  • Pam Enderby
  • Mark S. Hawley
  • Mark Parker
Abstract

We describe an unusual ASR application: recognition of command words from severely dysarthric speakers, who have poor control of their articulators. The goal is to allow these clients to control assistive technology by voice. While this is a small-vocabulary, speaker-dependent, isolated-word application, the speech material is more variable than normal, and only a small amount of data is available for training. After training a CDHMM recogniser, it is necessary to predict its likely performance without using an independent test set, so that confusable words can be replaced by alternatives. We present a battery of measures of consistency and confusability, based on forced alignment, which can be used to predict recogniser performance. We show how these measures perform, and how they are presented to the clinicians who are the users of the system.

Similar resources

Multi-Stage DNN Training for Automatic Recognition of Dysarthric Speech

Incorporating automatic speech recognition (ASR) in individualized speech training applications is becoming more viable thanks to the improved generalization capabilities of neural network-based acoustic models. The main problem in developing applications for dysarthric speech is the relative in-domain data scarcity. Collecting representative amounts of dysarthric speech data is difficult due t...

Maximum Likelihood Linear Regression (MLLR) for ASR Severity Based Adaptation to Help Dysarthric Speakers

Automatic speech recognition (ASR) for dysarthric speakers is one of the most challenging research areas. The lack of corpora for dysarthric speakers makes it even more difficult. Speaker adaptation (SA) is an alternative solution to overcome the lack of dysarthric speech and enhance the performance of ASR. This paper introduces Severity-based adaptation, using a small amount of speech dat...

Voice-based Age and Gender Recognition using Training Generative Sparse Model

Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...

Polynomial dynamic time warping kernel support vector machines for dysarthric speech recognition with sparse training data

This paper describes a new formulation of a polynomial sequence kernel based on dynamic time warping (DTW) for support vector machine (SVM) classification of isolated words given very sparse training data. The words are uttered by dysarthric speakers who suffer from debilitating neurological conditions that make the collection of speech samples a time-consuming and low-yield process. Data for bu...

Automatic recognition of Dutch dysarthric speech: a pilot study

This paper describes a feasibility study into automatic recognition of Dutch dysarthric speech. Recognition experiments with speaker independent and speaker dependent models are compared, for tasks with different perplexities. The results show that speaker dependent speech recognition for dysarthric speakers is very well possible, even for higher perplexity tasks.

Publication date: 2003